Data Mining For Business: Everything You Need To Know About Data Mining And Data-Analytic; Learn The Machine Learning To Increase Your Sense Of Artificial Intelligence. by Zak Cameron

Author: Zak, Cameron
Language: eng
Format: epub
Published: 2020-08-27T16:00:00+00:00


To build the signature table, a graph is constructed in which each node corresponds to an item. An edge is added to the graph for every frequent pair of items, and the weight of the edge is a function of the support of that pair. In addition, the weight of a node is the support of the corresponding item. The goal is to cluster this graph into K partitions so that the total weight of the edges crossing between partitions is as small as possible and the partitions are well balanced. Minimizing the edge weight across partitions ensures that correlated sets of items are grouped together. The partitions should be as balanced as possible so that the itemsets mapped to each super-coordinate are also as balanced as possible. This approach thus transforms the items into a similarity graph that can be clustered into partitions. A variety of clustering algorithms can be used to partition the graph into groups of items. Any of the graph clustering algorithms discussed in Chap. 19, such as METIS, can be used for this purpose. The bibliographic notes also contain pointers to specific methods for constructing the signature table.
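The graph construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the `min_pair_support` threshold, and the use of raw pair support as the edge weight are hypothetical choices, and `greedy_partition` is a simple greedy stand-in for a real graph partitioner such as METIS, not the method the text refers to.

```python
from collections import Counter
from itertools import combinations


def build_item_graph(transactions, min_pair_support=2):
    """Return (node_weight, edge_weight) for the item similarity graph."""
    node_weight = Counter()  # support of each single item
    pair_count = Counter()   # support of each pair of items
    for t in transactions:
        items = sorted(set(t))
        node_weight.update(items)
        pair_count.update(combinations(items, 2))
    # Assumption: an edge exists only for frequent pairs, weighted by raw support.
    edge_weight = {p: s for p, s in pair_count.items() if s >= min_pair_support}
    return node_weight, edge_weight


def affinity(item, group, edge_weight):
    """Total weight of edges between `item` and the items already in `group`."""
    return sum(w for (a, b), w in edge_weight.items()
               if (a == item and b in group) or (b == item and a in group))


def greedy_partition(node_weight, edge_weight, k):
    """Greedy stand-in for a graph partitioner (hypothetical, not METIS)."""
    parts = [set() for _ in range(k)]
    load = [0] * k
    cap = sum(node_weight.values()) / k  # target node weight per partition
    # Place heavy items first; break ties by name so the result is deterministic.
    for item in sorted(node_weight, key=lambda i: (-node_weight[i], i)):
        # Prefer partitions below capacity, then high affinity, then low load.
        best = min(range(k), key=lambda p: (load[p] >= cap,
                                            -affinity(item, parts[p], edge_weight),
                                            load[p]))
        parts[best].add(item)
        load[best] += node_weight[item]
    return parts


def cut_weight(parts, edge_weight):
    """Total weight of edges whose endpoints lie in different partitions."""
    where = {i: p for p, group in enumerate(parts) for i in group}
    return sum(w for (a, b), w in edge_weight.items() if where[a] != where[b])


transactions = [
    ["bread", "butter", "jam"], ["bread", "butter", "jam"], ["bread", "butter"],
    ["beer", "chips", "salsa"], ["beer", "chips", "salsa"], ["beer", "chips"],
    ["bread", "beer"],
]
node_weight, edge_weight = build_item_graph(transactions)
parts = greedy_partition(node_weight, edge_weight, k=2)
print(parts, cut_weight(parts, edge_weight))
```

On this toy data the infrequent pair (beer, bread) produces no edge, so the two correlated groups of items end up in separate, equally loaded partitions with no edge weight crossing between them. A production system would replace the greedy step with a proper balanced graph partitioner.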

After the signatures have been determined, the partitions of the data can be defined using the super-coordinates of the itemsets. Every itemset belongs to the partition identified by its super-coordinate. Unlike inverted lists, the itemsets themselves are stored explicitly in this list, rather than merely their identifiers. This means there is no need to access a secondary data structure to retrieve the itemsets. This is why the signature table can be used to retrieve the itemsets themselves rather than just their identifiers.
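A minimal sketch of this storage scheme follows. It assumes that each signature is one of the K groups of items, and that a super-coordinate is a tuple of K bits whose j-th bit is set when the itemset shares at least `r` items with signature j. The activation threshold `r`, the bit-tuple encoding, and the function names are assumptions made for illustration; they are not defined in this passage.

```python
from collections import defaultdict


def super_coordinate(itemset, signatures, r=1):
    """Bit tuple: position j is 1 when the itemset shares >= r items with signature j."""
    items = set(itemset)
    return tuple(int(len(items & sig) >= r) for sig in signatures)


def build_signature_table(itemsets, signatures, r=1):
    """Map each super-coordinate to the full itemsets that fall under it."""
    table = defaultdict(list)
    for itemset in itemsets:
        # The itemset itself is stored, not an identifier pointing elsewhere.
        table[super_coordinate(itemset, signatures, r)].append(frozenset(itemset))
    return dict(table)


signatures = [{"beer", "chips", "salsa"}, {"bread", "butter", "jam"}]
itemsets = [{"bread", "butter"}, {"beer", "chips"}, {"bread", "beer"}, {"jam"}]
table = build_signature_table(itemsets, signatures)
for coord, stored in sorted(table.items()):
    print(coord, [sorted(s) for s in stored])
```

Because each table entry holds the itemsets directly, a lookup by super-coordinate returns the data without a second access step, which is the contrast with inverted lists drawn above.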

The signature table can handle general similarity queries that cannot be handled efficiently with inverted lists. Let x be the number of items in which an itemset matches a target Q, and y the number of items in which it differs from the target. The signature table can handle similarity functions of the form f(x, y) that, for a fixed target record Q, satisfy the following two properties:

$$\frac{\partial f(x, y)}{\partial x} \geq 0 \tag{5.2}$$

$$\frac{\partial f(x, y)}{\partial y} \leq 0 \tag{5.3}$$

This is called the monotonicity property. These natural conditions ensure that the function is increasing in the number of matches and decreasing in the Hamming distance. The match function and the Hamming distance clearly satisfy these conditions, and it can be shown that other set-wise similarity functions, such as the cosine and the Jaccard coefficient, also satisfy them. For example, let P and Q be the sets of items in two itemsets, where Q is the target itemset. Then the cosine function can be expressed in terms of x and y as follows:

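$$\text{Cosine}(P, Q) = \frac{x}{\sqrt{|P|} \cdot \sqrt{|Q|}} = \frac{x}{\sqrt{2 \cdot x + y - |Q|} \cdot \sqrt{|Q|}}$$

$$\text{Jaccard}(P, Q) = \frac{x}{x + y}$$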
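The two expressions can be checked numerically with a short Python sketch. It takes x as the size of the intersection of P and Q and y as the size of their symmetric difference (the Hamming distance between the two sets), so that |P| = 2x + y − |Q|. The function names are hypothetical, and both sets are assumed to be non-empty.

```python
import math


def match_and_hamming(p, q):
    """Return (x, y): number of matching items and number of differing items."""
    p, q = set(p), set(q)
    return len(p & q), len(p ^ q)


def cosine_xy(x, y, q_size):
    """Cosine similarity written purely in terms of x, y and |Q|."""
    p_size = 2 * x + y - q_size  # |P| recovered from x, y and |Q|
    return x / (math.sqrt(p_size) * math.sqrt(q_size))


def jaccard_xy(x, y):
    """Jaccard coefficient: |P & Q| / |P | Q| = x / (x + y)."""
    return x / (x + y)


P = {"a", "b", "c", "d"}
Q = {"b", "c", "e"}
x, y = match_and_hamming(P, Q)
print(x, y, cosine_xy(x, y, len(Q)), jaccard_xy(x, y))

# Cross-check against the direct set-based definitions.
assert math.isclose(cosine_xy(x, y, len(Q)),
                    len(P & Q) / math.sqrt(len(P) * len(Q)))
assert math.isclose(jaccard_xy(x, y), len(P & Q) / len(P | Q))
```

For this pair, x = 2 and y = 3, and raising x or lowering y while holding the other argument fixed increases both values, in line with the monotonicity property.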

Both functions are increasing in x and decreasing in y. These properties are important because they allow bounds on the similarity function to be derived from bounds on its arguments. If π is an upper bound on the value of x


